Estimation of Intensity-Duration-Frequency (IDF) Curve through L-Moment method (Chiavari hourly dataset)

Emanuele Cordano

2023-06-23

Intensity-Duration-Frequency Curve (IDF)

Intensity-duration-frequency (IDF) curves usefully quantify extreme precipitation over various duration, e.g. from 1 to 360 hours, and return periods for engineering design, e.g. the interest in the time concentration or time-structure of the rainfall. (e.g. Courty et al. (2019),Sun et al. (2019),Markiewicz (2021)) Among the different mathematical formulations used in literature (CCC), the most commomt and used here in thiss package is the following (Koutsoyiannis, Kozonis, and Manetas (1998)): \[ I = a(T) D^{n} \] where \(I\) is precipitation intensity [mm/hr or mm/day] , \(D\) is the duration of the extreme precipitation (storm) event, and \(a(T)\) is a coefficient depending on return period \(T\) or probability/frequency \(f\) (where \(f=1-{1 /over T\) where studying storm and flood events) and \(n\) is an exponent assumed to be constant. The key concepts behind this theory are they:

Therefore annual maxima precipitation are collected from daily or hourly time series; for each duration a probability distributions (e.g. a GEV distribution (Wikipedia contributors (2022))) is fitted through a Maximum Likelihood Method or a L-Moment Method; goodness-of-fit is verified with a statistical test (e.g. Kolgorov-Smirnov test (Marsaglia, Tsang, and Wang (2003))); some moments and/or parameter are scaled with duration through a regression (e.g. a log-LM model); a probability distribution with new moments’ and parameters’ values is estimated for each duration; it can be tested versus original maxima time series. This package uses L-Moment method (Hosking (2019),Cordano (2022)) for the fit of the probability distribution. Recently, Heidari et al. (2020) extend also the concepts of IDF curves to droughts. This aspect can be used in this package but requires further investigations.

An application with “chiavari” internal dataset

Here is an application of use of the package with ‘chiavari’ (see dataset documentation)(Museo Scientifico G. Sanguineti - G. Leonardini (n.d.),ARPAL (n.d.)):

data(chiavari)
help(chiavari)

library(lmomIDF) 

data(chiavari) 
library(dplyr)
library(magrittr)
# years <- 1937:2020
# gsod_rds <- "/home/ecor/local/rpackages/jrc/lmomIDF/inst/ext_data/gsod_623180-99999.rds"
# cond <- file.exists(gsod_rds) 
# if (!cond) {
#   gsod <- get_GSOD(years=years,station="623180-99999") ##ALEXANDRIA INTL EG 623180-99999
#   saveRDS(gsod,file=gsod_rds)  
# } else {
#   gsod <- readRDS(file=gsod_rds)
# }

#####
#####

prec <- chiavari %>% dplyr::select(time,value,centralina,granularity_hr) ## %>%
## mutate(YEARMODA=as.Date(YEARMODA,format="%Y-%m-%d")) 

knitr::kable(rbind(head(prec),tail(prec)))
time value centralina granularity_hr
1 1883-12-01 22:00:00 0 t 24
2 1883-12-01 23:00:00 0 t 24
3 1883-12-02 00:00:00 0 t 24
4 1883-12-02 01:00:00 0 t 24
5 1883-12-02 02:00:00 0 t 24
6 1883-12-02 03:00:00 0 t 24
1165070 2022-12-31 20:00:00 0 arpal 1
1165071 2022-12-31 21:00:00 0 arpal 1
1165072 2022-12-31 22:00:00 0 arpal 1
1165073 2022-12-31 23:00:00 0 arpal 1
1165074 2023-01-01 00:00:00 0 arpal 1
1165075 2023-01-01 01:00:00 0 arpal 1

### Dataset preparation 
## insertion of rows so that all instants are equidistant (i.e. there is a constant time step) 
## inserting any NA values of precipitation intensity


dds <- range(prec$time)
## See GSODR documentation
yymmddhhs <- seq(from=dds[1],to=dds[2],by="hour")
prec <- data.table::data.table(time=yymmddhhs) %>% full_join(prec)
#> Joining with `by = join_by(time)`
prec$value[is.na(prec$value)] <- 0
## TO COMPLETE 
##
prec <- prec ##%>% group_by(time,centralina) %>% summarize(value=max(value,na.rm=TRUE),granularity_hr=max(granularity_hr,na.rm=TUE)) %>% ungroup()  

Precipitation Time Series

Time series can be visualized as follows:


library(zoo)
library(dygraphs)

time <- prec$time
precz <- prec$value %>% as.zoo()
index(precz) <- time
main="PRECIPITATION Vs TIME"
ylab="precipitation intensity [mm/hr]"
xlab="time"

dyd <- dygraph(precz,main=main,ylab=ylab,xlab=xlab) %>% dyRangeSelector()
dyd

Annual Maxima for fixed duration

Annual maxima with duration from 1 to 5 days are extracted:


library(lmomIDF)
library(lubridate)

dd <- c(1,3,6,12,24,48)
ann.maxima <- annual.agg(x=prec$value,time=prec$time,aggr.fun=max,dd=dd)
### values cannot be -Inf or +Inf
ann.maxima$aggr[ann.maxima$aggr==-Inf] <- NA
###
str(ann.maxima)
#> tibble [846 × 3] (S3: tbl_df/tbl/data.frame)
#>  $ dd   : num [1:846] 1 1 1 1 1 1 1 1 1 1 ...
#>  $ index: num [1:846] 1883 1884 1885 1886 1887 ...
#>  $ aggr : num [1:846] 4.4 46.5 34.9 112 56.9 ...
###
ann.maxima.z <- ann.maxima %>% reshape2::dcast(index ~ dd)
#> Using aggr as value column: use value.var to override.
time <- as.numeric(ann.maxima.z$index)
ann.maxima.z <- ann.maxima.z %>% dplyr::select(-index) %>% as.zoo()
time1 <- as.Date("1980-01-01")
year(time1) <- time
index(ann.maxima.z) <- time1 


main="MAXIMUM PRECIPITATION INTENSITY Vs TIME"
ylab="precipitation intensity [mm/hr]"
xlab="time"

dyda <- dygraph(ann.maxima.z,main=main,ylab=ylab,xlab=xlab) %>% dyRangeSelector()
dyda

Study Period: 1981-2010

L-moment estimation

Assuming a reference period from 1981 to 2010, L-moments are computed as follows (Hosking (2019)):


x <- ann.maxima %>% filter(as.numeric(index) %in% 1980:2010) 
lmom <- annual.agg.samlmu(x)

head(lmom)
#>           l_1       l_2        t_3        t_4 dd
#> X1  35.474194 8.3288175 0.21666472 0.23604112  1
#> X3  16.975269 3.7754840 0.14669513 0.05290687  3
#> X6   9.768280 2.1002510 0.19916932 0.07219986  6
#> X12  5.841129 1.2498387 0.22621979 0.09989340 12
#> X24  3.438306 0.7018369 0.17008574 0.07571238 24
#> X48  2.199966 0.4375381 0.07309974 0.05402941 48

GEV fitting

Through L-Moments , GEV probability distribution parameter were estimated:


para <- annual.agg.pel(distrib="gev",x=x,lmom=lmom)

para$D001
#>          xi       alpha           k 
#> 28.16276789 11.19859660 -0.07143726 
#> attr(,"probability_distrib")
#> [1] "gev"
#> attr(,"dd")
#> D001 
#>    1 
#> attr(,"ks.test")
#> 
#>  Asymptotic one-sample Kolmogorov-Smirnov test
#> 
#> data:  xval
#> D = 0.093463, p-value = 0.9494
#> alternative hypothesis: two-sided
attr(para,"lmom")
#>           l_1       l_2        t_3        t_4 dd   p.value
#> X1  35.474194 8.3288175 0.21666472 0.23604112  1 0.9493965
#> X3  16.975269 3.7754840 0.14669513 0.05290687  3 0.9553699
#> X6   9.768280 2.1002510 0.19916932 0.07219986  6 0.8961336
#> X12  5.841129 1.2498387 0.22621979 0.09989340 12 0.7004706
#> X24  3.438306 0.7018369 0.17008574 0.07571238 24 0.9425933
#> X48  2.199966 0.4375381 0.07309974 0.05402941 48 0.7310434

IDF curve (1961-1990)

Regressing logarithms the absolute 1nd and 2rd L-moments with the logarithm of duration, it is obtained:



lmom_idf_1981_2010 <- annual.agg.idf.samlmu(x,lmom=attr(para,"lmom"))
####
summary(attr(lmom_idf_1981_2010,"fit"))
#> 
#> Call:
#> glm(formula = out)
#> 
#> Coefficients:
#>              Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)  3.632275   0.026892  135.07 3.40e-16 ***
#> lm_orderl_2 -1.539229   0.025116  -61.28 4.13e-13 ***
#> log_dd      -0.750579   0.009752  -76.97 5.34e-14 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> (Dispersion parameter for gaussian family taken to be 0.001892428)
#> 
#>     Null deviance: 18.335946  on 11  degrees of freedom
#> Residual deviance:  0.017032  on  9  degrees of freedom
#> AIC: -36.636
#> 
#> Number of Fisher Scoring iterations: 2
####
attr(lmom_idf_1981_2010,"gg")
#> `geom_smooth()` using formula = 'y ~ x'

A IDF relations like \(I = a(T) D^{n}\), where \(n\) is estimated to be -0.75058, has been assed:

para_idf_1981_2010 <- annual.agg.pel(distrib="gev",x=x,lmom=lmom_idf_1981_2010)

para_idf_1981_2010$D001
#>          xi       alpha           k 
#> 31.11213584 11.54066183 -0.00456607 
#> attr(,"probability_distrib")
#> [1] "gev"
#> attr(,"dd")
#> D001 
#>    1 
#> attr(,"ks.test")
#> 
#>  Asymptotic one-sample Kolmogorov-Smirnov test
#> 
#> data:  xval
#> D = 0.14275, p-value = 0.5527
#> alternative hypothesis: two-sided
attr(para_idf_1981_2010,"lmom")
#>           l_1       l_2       t_3        t_4 dd   p.value
#> 1   37.825925 8.0333320 0.1728626 0.09679195  1 0.5526599
#> 1.1 16.583346 3.5219105 0.1728626 0.09679195  3 0.8177123
#> 1.2  9.856557 2.0932996 0.1728626 0.09679195  6 0.7880802
#> 1.3  5.858390 1.2441836 0.1728626 0.09679195 12 0.5915897
#> 1.4  3.482021 0.7394989 0.1728626 0.09679195 24 0.9395884
#> 1.5  2.069591 0.4395321 0.1728626 0.09679195 48 0.1042628

p.values of Kolgormov-Smirov’s test present values greater than 0.09. Here is a boxplot of precipitation intensity versus duration with IDF curves for some return periods (e.g. 2,5,10,…,100,… years)(legend reports frequencies/probabilities \(f\) defined as \(f=1- {1 \over T}\) with \(T\) return period):


set.seed(560)
rt <- c(2,5,10,20,50,100,200,500)
f  <- 1-1/rt
##f=c(0.5,0.8,0.9,0.95,0.98,0.99,0.999,0.9999)
#library(RColorBrewer) ## FACTOR (migilioramento grafico)
#library(ggplot2)
#source("/home/ecor/local/rpackages/jrc/lmomIDF/R/annual.agg.qua.R")

out_qua_1981_2010 <- annual.agg.qua(f=f,para=para_idf_1981_2010,remove_distrib_from_boxplot=TRUE)
attr(out_qua_1981_2010,"idf")

And the analogous plot resenting precipitation depth with DDF (Depth-Duration-Frequency) curve :


attr(out_qua_1981_2010,"ddf")

Study Period: 1991-2020

L-moment estimation

Assuming a reference period from 1991 to 2020, L-moments are computed as follows (Hosking (2019)):


x <- ann.maxima %>% filter(as.numeric(index) %in% 1980:2020)
lmom <- annual.agg.samlmu(x)

head(lmom)
#>           l_1       l_2        t_3        t_4 dd
#> X1  34.568293 8.1602441 0.15202064 0.20977548  1
#> X3  17.591057 4.1321139 0.11070719 0.04450216  3
#> X6  10.363008 2.3553862 0.16075287 0.03725842  6
#> X12  6.171748 1.3297155 0.12865233 0.04110647 12
#> X24  3.587703 0.6917022 0.05392307 0.04910377 24
#> X48  2.259019 0.4172815 0.02499268 0.05141197 48

GEV fitting

Through the L-Moments , GEV probability distribution parameters are estimated as follows:


para <- annual.agg.pel(distrib="gev",x=x,lmom=lmom)

para$D001
#>          xi       alpha           k 
#> 27.92569254 12.07395926  0.02805672 
#> attr(,"probability_distrib")
#> [1] "gev"
#> attr(,"dd")
#> D001 
#>    1 
#> attr(,"ks.test")
#> 
#>  Asymptotic one-sample Kolmogorov-Smirnov test
#> 
#> data:  xval
#> D = 0.071804, p-value = 0.9841
#> alternative hypothesis: two-sided
attr(para,"lmom")
#>           l_1       l_2        t_3        t_4 dd   p.value
#> X1  34.568293 8.1602441 0.15202064 0.20977548  1 0.9840814
#> X3  17.591057 4.1321139 0.11070719 0.04450216  3 0.9036491
#> X6  10.363008 2.3553862 0.16075287 0.03725842  6 0.7897398
#> X12  6.171748 1.3297155 0.12865233 0.04110647 12 0.5357254
#> X24  3.587703 0.6917022 0.05392307 0.04910377 24 0.9253937
#> X48  2.259019 0.4172815 0.02499268 0.05141197 48 0.8186521

IDF curve (1991-2020)

Regressing the logarithms the absolute 1nd and 2rd L-moments with the logarithm of duration, it is obtained:



lmom_idf_1991_2020 <- annual.agg.idf.samlmu(x,lmom=attr(para,"lmom"))
####
summary(attr(lmom_idf_1991_2020,"fit"))
#> 
#> Call:
#> glm(formula = out)
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)  3.66788    0.05088   72.08 9.62e-14 ***
#> lm_orderl_2 -1.54064    0.04752  -32.42 1.24e-10 ***
#> log_dd      -0.75225    0.01845  -40.77 1.60e-11 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> (Dispersion parameter for gaussian family taken to be 0.006775518)
#> 
#>     Null deviance: 18.44279  on 11  degrees of freedom
#> Residual deviance:  0.06098  on  9  degrees of freedom
#> AIC: -21.331
#> 
#> Number of Fisher Scoring iterations: 2
####
attr(lmom_idf_1991_2020,"gg")
#> `geom_smooth()` using formula = 'y ~ x'

A IDF relations like \(I = a(T) D^{n}\), where \(n\) is estimated to be -0.75225, is then assessed:

para_idf_1991_2020 <- annual.agg.pel(distrib="gev",x=x,lmom=lmom_idf_1991_2020)

para_idf_1991_2020$D001
#>         xi      alpha          k 
#> 32.6961143 13.1073900  0.0853927 
#> attr(,"probability_distrib")
#> [1] "gev"
#> attr(,"dd")
#> D001 
#>    1 
#> attr(,"ks.test")
#> 
#>  Asymptotic one-sample Kolmogorov-Smirnov test
#> 
#> data:  xval
#> D = 0.17562, p-value = 0.1594
#> alternative hypothesis: two-sided
attr(para_idf_1991_2020,"lmom")
#>           l_1       l_2       t_3        t_4 dd   p.value
#> 1   39.234231 8.4459392 0.1162076 0.07753191  1 0.1593846
#> 1.1 17.169307 3.6960307 0.1162076 0.07753191  3 0.6127154
#> 1.2 10.193053 2.1942549 0.1162076 0.07753191  6 0.6088332
#> 1.3  6.051399 1.3026825 0.1162076 0.07753191 12 0.3973438
#> 1.4  3.592587 0.7733749 0.1162076 0.07753191 24 0.8451978
#> 1.5  2.132843 0.4591363 0.1162076 0.07753191 48 0.1160387

p.values of Kolgormov-Smirov’s test present values greater than 0.11. Here is a boxplot of precipitation intensity versus duration with IDF curves for some return periods (e.g. 2,5,10,…,100,… years)(legend reports frequencies/probabilities \(f\) defined as \(f=1- {1 \over T}\) with \(T\) return period):


set.seed(560)
rt <- c(2,5,10,20,50,100,200,500)
f  <- 1-1/rt
##f=c(0.5,0.8,0.9,0.95,0.98,0.99,0.999,0.9999)
out_qua_1991_2020 <- annual.agg.qua(f=f,para=para_idf_1991_2020,remove_distrib_from_boxplot=TRUE)
attr(out_qua_1991_2020,"idf")

And the analogous plot resenting precipitation depth with DDF (Depth-Duration-Frequacy) curve :


attr(out_qua_1991_2020,"ddf")

Comparison between 1981-2010 and 1991-2020

Finally, the estimation of the function \(a(T)\), varying with time periods, is affected by the choice of the reference periods: at a fixed return period (or frequency/probability) the quarantines estimating using 1991-2021 is greater than the one using 1981-2010 as reference time window, only in case of longer return periods.


qua_1991_2020 <- out_qua_1991_2020
qua_1981_2010 <- out_qua_1981_2010

attr(out_qua_1981_2010,"n_idf")
#> [1] -0.7505794
attr(out_qua_1991_2020,"n_idf")
#> [1] -0.7522456

qua_1981_2010$a <- qua_1981_2010$aggr/(qua_1981_2010$dd^attr(out_qua_1981_2010,"n_idf"))
qua_1991_2020$a <- qua_1991_2020$aggr/(qua_1991_2020$dd^attr(out_qua_1991_2020,"n_idf"))
#####
qua_1981_2010 <- qua_1981_2010 %>% dplyr::select(-aggr)
qua_1991_2020 <- qua_1991_2020 %>% dplyr::select(-aggr)
#####

names(qua_1981_2010)[names(qua_1981_2010)=="a"] <- "a_1981_2010"
names(qua_1991_2020)[names(qua_1991_2020)=="a"] <- "a_1991_2020"
#####


# > ggplot()+geom_point(data=qua_both,aes(x=aggr_1981_2010,y=aggr_1991_2010))
# Error in FUN(X[[i]], ...) : object 'aggr_1991_2010' not found
# > ggplot()+geom_point(data=qua_both,aes(x=aggr_1981_2010,y=aggr_1991_2020))
# > ggplot()+geom_point(data=qua_both,aes(x=aggr_1981_2010,y=aggr_1991_2020,group=dd))
# > ggplot()+geom_point(data=qua_both,aes(x=aggr_1981_2010,y=aggr_1991_2020,group=dd,fill=dd))
# > ggplot()+geom_point(data=qua_both,aes(x=aggr_1981_2010,y=aggr_1991_2020,group=dd,color=factor(dd)))
# >
#####
qua_both <- full_join(qua_1981_2010,qua_1991_2020)
#> Joining with `by = join_by(f, dd)`

#####
library(ggplot2)
library(RColorBrewer)

gg <- ggplot()+geom_point(data=qua_both,aes(x=a_1981_2010,y=a_1991_2020,color=factor(f)))
gg <- gg+xlab("Quantiles 1981-2010")+ylab("Quantiles 1991-2020")+theme_bw()+geom_abline()
###
col1 <- brewer.pal(name="YlGnBu",n=length(unique(qua_both$f))) %>% col2rgb()
col2 <- brewer.pal(name="YlOrRd",n=length(unique(qua_both$f))) %>% rev() %>% col2rgb()
coll <- (col1*0.3+col2*0.7)/255
##coll[,] <- as.integer(coll[,])
colz <- rgb(red=coll["red",],green=coll["green",],blue=coll["blue",],maxColorValue = 1)

##YlOrRd
gg <- gg+scale_color_manual(name="Probability",values=colz)
gg

Conclusions

Finally, a vignette about a proper use of lmomIDF with hourly precipitation time series package has been shown. The user can make his/her own quantitative assessments of the results and draw on the chunks of code reported to tailor what the package can offer to his/her own needs.

Further Readings on Similar Topic

Similar reading on which outcomes can be compared:

The list and URLs and/or references will be further updated.

References

ARPAL, Regione Liguria. n.d. “ARPAL Liguria Meteo Data.” https://ambientepub.regione.liguria.it/SiraQualMeteo/script/PubAccessoDatiMeteo.asp.
Cordano, Emanuele. 2022. lmomPi: (Precipitation) Frequency Analysis and Variability with l-Moments from ’Lmom’. https://CRAN.R-project.org/package=lmomPi.
Courty, Laurent G, Robert L Wilby, John K Hillier, and Louise J Slater. 2019. “Intensity-Duration-Frequency Curves at the Global Scale.” Environmental Research Letters 14 (8): 084045. https://doi.org/10.1088/1748-9326/ab370a.
Heidari, Hadi, Mazdak Arabi, Mahshid Ghanbari, and Travis Warziniack. 2020. “A Probabilistic Approach for Characterization of Sub-Annual Socioeconomic Drought Intensity-Duration-Frequency (IDF) Relationships in a Changing Environment.” Water 12 (6). https://doi.org/10.3390/w12061522.
Hosking, J. R. M. 2019. L-Moments. https://CRAN.R-project.org/package=lmom.
Koutsoyiannis, Demetris, Demosthenes Kozonis, and Alexandros Manetas. 1998. “A Mathematical Framework for Studying Rainfall Intensity-Duration-Frequency Relationships.” Journal of Hydrology 206 (1): 118–35. https://doi.org/https://doi.org/10.1016/S0022-1694(98)00097-3.
Markiewicz, Iwona. 2021. “Depth&ndash;duration&ndash;frequency Relationship Model of Extreme Precipitation in Flood Risk Assessment in the Upper Vistula Basin.” Water 13 (23). https://doi.org/10.3390/w13233439.
Marsaglia, George, Wai Wan Tsang, and Jingbo Wang. 2003. “Evaluating Kolmogorov’s Distribution.” Journal of Statistical Software 8 (18): 1–4. https://doi.org/10.18637/jss.v008.i18.
Museo Scientifico G. Sanguineti - G. Leonardini, Associazione Amici del. n.d. “Recupero e Digitalizzazione Della Serie Storica Dei Dati Meteo Rilevati Presso l’osservatorio Meteorologico Sismico Del Seminario Vescovile Di Chiavari Negli Anni 1883 – 2014.” https://servizi.comune.chiavari.ge.it/osservatorio/.
Sun, Yabin, Dadiyorto Wendi, Dong Eon Kim, and Shie-Yui Liong. 2019. “Deriving Intensity–Duration–Frequency (IDF) Curves Using Downscaled in Situ Rainfall Assimilated with Remote Sensing Data.” Geoscience Letters 6 (1): 17. https://doi.org/10.1186/s40562-019-0147-x.
Wikipedia contributors. 2022. “Generalized Extreme Value Distribution — Wikipedia, the Free Encyclopedia.” https://en.wikipedia.org/w/index.php?title=Generalized_extreme_value_distribution&oldid=1118970041.